Toponym Disambiguation by Arborescent Relationships

Authors

  • Imene Bensalem
  • Mohamed-Khireddine Kholladi
Abstract

Problem statement: A place in geographical space can be referred to formally, by its spatial coordinates, or informally, by the toponyms (place names) used in natural language. A toponym can represent several geographical places. This ambiguity makes its conversion to a unique formal representation problematic. Toponym disambiguation (TD) in text is the task of assigning a unique location to an ambiguous place name in a given textual context. Approach: Several toponym disambiguation heuristics assume geographical proximity between the toponyms of the same context. This proximity can be in terms of spatial distance or in terms of arborescent relationships, i.e., proximity in the hierarchical tree of the world's places. This study presents a new heuristic for toponym disambiguation in text, based on quantifying the arborescent proximity between toponyms. This quantification is done with a new measure of geographical correlation that we call the Geographical Density. Results: Our method was compared with state-of-the-art methods on the GeoSemCor corpus and outperformed them in terms of recall (87.4%) and coverage (99.0%). The results showed that the toponyms of the same context are much closer in terms of arborescent relationships than in terms of spatial relationships. Conclusion: We believe that quantifying the arborescent relationships between toponyms of the same textual context is a good way to improve the recall of the TD task. However, all types of arborescent relationships must be considered, not only meronymy, which is the relation most exploited in existing TD methods.
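
The abstract does not define the Geographical Density measure, so the following is only a minimal sketch of the general idea of arborescent proximity, not the authors' method. It assumes each place is given as a root-to-place path in a place hierarchy, measures closeness as the number of tree edges between two places, and picks the candidate referent closest to the other toponyms of the context. The names `tree_distance` and `disambiguate` and the toy hierarchy are hypothetical.

```python
from typing import List, Tuple

# Assumption: a place is represented by its path from the root of the hierarchy.
Path = Tuple[str, ...]


def tree_distance(a: Path, b: Path) -> int:
    """Count the edges between two places via their deepest common ancestor."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def disambiguate(candidates: List[Path], context: List[Path]) -> Path:
    """Return the candidate with the smallest total tree distance to the context."""
    return min(candidates, key=lambda c: sum(tree_distance(c, p) for p in context))


# Hypothetical toy data: two referents of the ambiguous toponym "Paris".
paris_fr: Path = ("World", "Europe", "France", "Paris")
paris_tx: Path = ("World", "North America", "United States", "Texas", "Paris")

# Other toponyms found in the same textual context, already resolved.
context: List[Path] = [
    ("World", "Europe", "France", "Lyon"),
    ("World", "Europe", "France"),
]

chosen = disambiguate([paris_fr, paris_tx], context)
print(" > ".join(chosen))
```

In this toy case the French referent wins because it shares a deeper common ancestor with the context places. A spatial-distance heuristic would compare coordinates instead and would need a gazetteer that supplies them.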

Similar resources

Text-Driven Toponym Resolution using Indirect Supervision

Toponym resolvers identify the specific locations referred to by ambiguous placenames in text. Most resolvers are based on heuristics using spatial relationships between multiple toponyms in a document, or metadata such as population. This paper shows that text-driven disambiguation for toponyms is far more effective. We exploit document-level geotags to indirectly generate training instances f...


A Hybrid Approach for Robust Multilingual Toponym Extraction and Disambiguation

Toponym extraction and disambiguation are key topics recently addressed by the fields of Information Extraction and Geographical Information Retrieval. Toponym extraction and disambiguation are highly dependent processes. Not only does toponym extraction effectiveness affect disambiguation, but disambiguation results may also help improve extraction accuracy. In this paper we propose a hybrid toponym...


Geo-WordNet: Automatic Georeferencing of WordNet

WordNet has been used extensively as a resource for the Word Sense Disambiguation (WSD) task, both as a sense inventory and a repository of semantic relationships. Recently, we investigated the possibility to use it as a resource for the Geographical Information Retrieval task, more specifically for the toponym disambiguation task, which could be considered a specialization of WSD. We found tha...


Improving Toponym Extraction and Disambiguation Using Feedback Loop

This paper addresses two problems with toponym extraction and disambiguation. First, almost no existing works examine the extraction and disambiguation interdependency. Second, existing disambiguation techniques mostly take as input extracted toponyms without considering the uncertainty and imperfection of the extraction process. It is the aim of this paper to investigate both avenues and to sh...


Toponym Extraction and Disambiguation Enhancement using Loops of Feedback

Toponym extraction and disambiguation have received much attention in recent years. Typical fields addressing these topics are information retrieval, natural language processing, and semantic web. This paper addresses two problems with toponym extraction and disambiguation. First, almost no existing works examine the extraction and disambiguation interdependency. Second, existing disambiguation...


Publication date: 2010